In finance, one usually deals not with prices but with growth rates $R$, defined as the difference in logarithm
between two consecutive prices. Here we consider not the trading volume, but rather the volume growth rate
$\tilde{R}$, the difference in logarithm between two consecutive values of trading volume. To this end, we use several
methods to analyze the properties of volume changes $|\tilde{R}|$, and their relationship to price changes $|R|$. We
analyze 14,981 daily recordings of the S&P 500 index over the 59-year period 1950-2009, and find power-law
cross-correlations between $|R|$ and $|\tilde{R}|$ using detrended cross-correlation analysis (DCCA). We introduce
a joint stochastic process that models these cross-correlations. Motivated by the relationship between $|R|$ and
$|\tilde{R}|$, we estimate the tail exponent $\tilde{\alpha}$ of the probability density function $P(|\tilde{R}|) \sim |\tilde{R}|^{-1-\tilde{\alpha}}$ for both the S&P
500 index as well as the collection of 1819 constituents of the New York Stock Exchange Composite index on
17 July 2009. As a new method to estimate $\tilde{\alpha}$, we calculate the time intervals $\tau_q$ between events where $\tilde{R} > q$.
We demonstrate that $\bar{\tau}_q$, the average of $\tau_q$, obeys $\bar{\tau}_q \sim q^{\tilde{\alpha}}$. We find $\tilde{\alpha} \approx 3$. Furthermore, by aggregating all $\tau_q$
values of 28 global financial indices, we also observe an approximate inverse cubic law.



There is a saying on Wall Street that "It takes volume to
move stock prices." A number of studies have analyzed the
relationship between price changes and the trading volume in
financial markets [1-14]. Some of these studies [1, 3-6] have
found a positive relationship between price change and the
trading volume. In order to explain this relationship, Clarke
assumed that the daily price change is the sum of a random
number of uncorrelated intraday price changes [3], and so
predicted that the variance of the daily price change is
proportional to the average number of daily transactions. If the
number of transactions is proportional to the trading volume, then
the trading volume is proportional to the variance of the daily
price change.
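
The argument above can be written out in one line. The sketch below assumes, beyond what the text states, that the intraday changes $\delta_i$ have zero mean and a common variance $\sigma_\delta^2$, and that the number of transactions $N$ is independent of them; the symbols $\Delta P$, $\delta_i$, $N$, and $\sigma_\delta$ are introduced here for illustration only.

```latex
\Delta P = \sum_{i=1}^{N} \delta_i, \qquad
\mathrm{Var}(\Delta P)
  = \mathrm{E}\!\left[\mathrm{Var}(\Delta P \mid N)\right]
  + \mathrm{Var}\!\left(\mathrm{E}[\Delta P \mid N]\right)
  = \mathrm{E}[N]\,\sigma_\delta^{2} + 0 .
```

Under these assumptions the variance of the daily change grows linearly with the mean number of transactions, which is the proportionality used in the text.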

The cumulative distribution function (cdf) of the absolute
logarithmic price change $|R|$ obeys a power law

$$ P(|R| > x) \sim x^{-\alpha}. \tag{1} $$

It is believed [15-18] that $\alpha \approx 3$ ("inverse cubic law"),
outside the range $\alpha < 2$ characterizing a Lévy distribution
[18,19]. A parallel analysis of $Q$, the volume traded, yields a
power law [20-28]

$$ P(Q > x) \sim x^{-\alpha_Q}. \tag{2} $$

To our knowledge, the logarithmic volume change $\tilde{R}$, and
its relation to the logarithmic price change $R$, has not been
analyzed, and this analysis is our focus here.

I. DATA ANALYZED

A. We analyze the S&P500 index recorded daily over the 
59-year period January 1950 - July 2009 (14,981 total 
data points). 

B. We also analyze 1819 New York Stock Exchange
(NYSE) Composite members comprising this index on
17 July 2009, recorded at one-day intervals (6,794,830
total data points). Both data sets are taken from
http://finance.yahoo.com. Different companies comprising
the NYSE Composite index have time series
of different lengths. The average time series length is
3,735 data points, the shortest time series is 10 data
points, while the longest is 11,966 data points. If the
data display scale independence, then the same scaling
law should hold for different time periods.

C. We also analyze 28 worldwide financial indices from 
http://finance.yahoo.com recorded daily. 

(i) 11 European indices (ATX, BEL20, CAC 40, 
DAX, AEX General, OSE All Share, MIBTel, 
Madrid General, Stockholm General, Swiss Mar- 
ket, FTSE 100), 

(ii) 12 Asian indices (All Ordinaries, Shanghai Com- 
posite, Hang Seng, BSE 30, Jakarta Composite, 
KLSE Composite, Nikkei 225, NZSE 50, Straits 
Times, Seoul Composite, Taiwan Weighted, TA- 
100), and 

(iii) 5 American and Latin American indices (MerVal, 
Bovespa, S&P TSX Composite, IPC, S&P500 In- 
dex). 

For each of the 1819 companies and 28 indices, we calculate
over the time interval of one day the logarithmic change in
price $S(t)$,

$$ R_t = \ln\left(\frac{S(t+1)}{S(t)}\right), \tag{3} $$

and also the logarithmic change in trading volume $Q(t)$,

$$ \tilde{R}_t = \ln\left(\frac{Q(t+1)}{Q(t)}\right). \tag{4} $$

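For each of the 3694 time series, we also calculate the absolute
values $|R_t|$ and $|\tilde{R}_t|$ and define the "price volatility" [30] and
"volume volatility," respectively,

$$ V_R = \frac{|R_t|}{\sigma_R} \tag{5} $$

and

$$ V_{\tilde{R}} = \frac{|\tilde{R}_t|}{\sigma_{\tilde{R}}}, \tag{6} $$

where $\sigma_R = (\langle |R_t|^2 \rangle - \langle |R_t| \rangle^2)^{1/2}$ and $\sigma_{\tilde{R}} = (\langle |\tilde{R}_t|^2 \rangle - \langle |\tilde{R}_t| \rangle^2)^{1/2}$
are the respective standard deviations.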
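
The definitions of the logarithmic changes and of the normalized volatilities can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names and the toy price and volume values are hypothetical and do not come from the data sets described above.

```python
import math

def log_changes(series):
    """Return ln(x[t+1] / x[t]) for consecutive positive values (Eqs. 3 and 4)."""
    return [math.log(b / a) for a, b in zip(series, series[1:])]

def normalized_volatility(changes):
    """Return |R_t| / sigma, with sigma the standard deviation of |R_t| (Eqs. 5 and 6)."""
    abs_vals = [abs(r) for r in changes]
    n = len(abs_vals)
    mean = sum(abs_vals) / n
    mean_sq = sum(v * v for v in abs_vals) / n
    sigma = math.sqrt(mean_sq - mean * mean)
    return [v / sigma for v in abs_vals]

# Hypothetical toy data, not taken from the paper.
prices = [100.0, 101.0, 99.5, 102.0, 101.2]
volumes = [1.0e6, 1.4e6, 0.9e6, 1.1e6, 2.0e6]

price_vol = normalized_volatility(log_changes(prices))
volume_vol = normalized_volatility(log_changes(volumes))
```

Dividing by the standard deviation puts series of very different scale, such as different companies, on a common footing before they are aggregated.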



II. METHODS 

Recently, several papers have studied the return intervals
$\tau$ between consecutive price fluctuations above a volatility
threshold $q$. The pdf of return intervals $P_q(\tau)$ scales with the
mean return interval $\bar{\tau}$ as [31-33]

$$ P_q(\tau) = \bar{\tau}^{-1} f(\tau/\bar{\tau}), \tag{7} $$

where $f(x)$ is a stretched exponential. Similar scaling was
found for intratrading times (case $q = 0$) in Ref. [34]. In this
paper we analyze either (i) separate indices or (ii) aggregated
data mimicking the market as a whole. In case (i), e.g., for the
S&P500 index and any given $q$, we calculate all the $\tau$ values between
consecutive index fluctuations and calculate the average return
interval $\bar{\tau}_q$. In case (ii), we estimate average market behavior,
e.g., by analyzing all the 500 members of the S&P500 index.
For each $q$ and each company we calculate all $\tau_q$ values and
their average.

For any given value of $q$, in order to improve statistics, we
aggregate all the $\tau$ values in one data set and calculate $\bar{\tau}_q$. If
the pdf of large volatilities is asymptotically power-law distributed,
$P(|x|) \sim |x|^{-1-\alpha}$, we propose
a novel estimator which relates the mean return intervals
$\bar{\tau}_q$ with $\alpha$, where $\bar{\tau}_q$ is calculated for both case (i) and case
(ii). Since on average there is one volatility above threshold $q$
for every $\bar{\tau}_q$ volatilities, then

$$ 1/\bar{\tau}_q \approx \int_q^{\infty} P(|x|)\, d|x| = P(|x| > q) \sim q^{-\alpha}. \tag{8} $$

For both case (i) and case (ii), we calculate $\bar{\tau}_q$ for varying $q$,
and obtain an estimate for $\alpha$ through the relationship

$$ \bar{\tau}_q \propto q^{\alpha}. \tag{9} $$

We compare our estimate for $\alpha$ in the above procedure with
the $\alpha$ value obtained from $P(|\tilde{R}| > q)$, using an alternative
method of Hill [35]. If the pdf follows a power law $P(x) \sim
A x^{-(1+\alpha)}$, we estimate the power-law exponent $\alpha$ by sorting
the normalized returns by their size, $x_1 > x_2 > \ldots > x_N$,
with the result [35]

$$ \alpha = (N-1) \left[ \sum_{i=1}^{N-1} \ln \frac{x_i}{x_N} \right]^{-1}, \tag{10} $$

where $N - 1$ is the number of tail data points. We employ
the criterion that $N$ does not exceed 10% of the sample size,
which to a good extent ensures that the sample is restricted to
the tail part of the pdf [36].

A new method based on detrended covariance, detrended
cross-correlation analysis (DCCA), has recently been proposed
[37]. To quantify power-law cross-correlations in
non-stationary time series, consider two long-range cross-correlated
time series $\{y_i\}$ and $\{y'_i\}$ of equal length $N$, and
compute two integrated signals $Y_k \equiv \sum_{i=1}^{k} y_i$ and $Y'_k \equiv
\sum_{i=1}^{k} y'_i$, where $k = 1, \ldots, N$. We divide the entire time
series into $N - n$ overlapping boxes, each containing $n + 1$
values. For both time series, in each box that starts at $i$ and
ends at $i + n$, define the "local trend", $\hat{Y}_{k,i}$ and $\hat{Y}'_{k,i}$
($i \le k \le i + n$), to be the ordinate of a
linear least-squares fit. We define the "detrended walk" as the
difference between the original walk and the local trend.

Next calculate the covariance of the residuals in each box,
$f^2_{\rm DCCA}(n,i) \equiv \frac{1}{n-1} \sum_{k=i}^{i+n} (Y_k - \hat{Y}_{k,i})(Y'_k - \hat{Y}'_{k,i})$.
Calculate the detrended covariance by summing over all overlapping
$N - n$ boxes of size $n$,

$$ F^2_{\rm DCCA}(n) \equiv (N-n)^{-1} \sum_{i=1}^{N-n} f^2_{\rm DCCA}(n,i). \tag{11} $$

If cross-correlations decay as a power law, the corresponding
detrended covariances are either always positive or always
negative, and the square root of the detrended covariance
grows with time window $n$ as

$$ F_{\rm DCCA}(n) \propto n^{\lambda_{\rm DCCA}}, \tag{12} $$

where $\lambda_{\rm DCCA}$ is the cross-correlation exponent. If, however,
the detrended covariance oscillates around zero as a function
of the time scale $n$, there are no long-range cross-correlations.

When only one random walk is analyzed ($Y_k = Y'_k$), the detrended
covariance $F_{\rm DCCA}(n)$ reduces to the detrended variance

$$ F_{\rm DFA}(n) \propto n^{\lambda_{\rm DFA}} \tag{13} $$

used in the DFA method [38,39].
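
The return-interval estimator, in which the mean return interval grows as a power of the threshold, can be sketched in a few lines of Python. This is a minimal illustration on synthetic Pareto-distributed data with a known tail exponent, not the authors' implementation; the function names, the sample size, the thresholds, and the use of raw thresholds (rather than units of standard deviation) are assumptions made here.

```python
import math
import random

def return_intervals(values, q):
    """Time intervals between consecutive events with value > q."""
    hits = [t for t, v in enumerate(values) if v > q]
    return [b - a for a, b in zip(hits, hits[1:])]

def mean_return_interval(values, q):
    """Average of the return intervals for threshold q."""
    intervals = return_intervals(values, q)
    return sum(intervals) / len(intervals)

def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def tail_exponent_from_intervals(values, thresholds):
    """Slope of log(mean return interval) against log(q)."""
    log_q = [math.log(q) for q in thresholds]
    log_tau = [math.log(mean_return_interval(values, q)) for q in thresholds]
    return fit_slope(log_q, log_tau)

# Synthetic sample with P(x > q) = q**(-3) for q >= 1 (assumed test data).
rng = random.Random(0)
true_alpha = 3.0
sample = [(1.0 - rng.random()) ** (-1.0 / true_alpha) for _ in range(200_000)]
alpha_hat = tail_exponent_from_intervals(sample, [2.0, 3.0, 4.0, 5.0, 6.0])
```

Because the estimate uses only the counts of threshold crossings, it needs enough events at the largest threshold; with few events the mean return interval becomes noisy.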



III. RESULTS OF ANALYSIS

We first investigate the daily closing values of the S&P500
index adjusted for stock splits together with their trading volumes.
In Fig. 1(a), we show the cross-correlation function
between $|R_t|$ and $|\tilde{R}_t|$ and the cross-correlation function
between $R_t$ and $\tilde{R}_t$. The solid lines are the 95% confidence
interval for the autocorrelations of an i.i.d. process. The
cross-correlation function between $R_t$ and $\tilde{R}_t$ is practically
negligible and stays within the 95% confidence interval. On the
contrary, the cross-correlation function between $|\tilde{R}_t|$ and $|R_t|$
is significantly different from zero at the 5% level for more
than 50 time lags.
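
A lagged cross-correlation function and the i.i.d. confidence band can be computed as in the following Python sketch. The band $\pm 1.96/\sqrt{N}$ is the standard large-sample approximation for independent series and is assumed here, since the text does not state how its band was computed; the function names and the synthetic stand-in series are hypothetical.

```python
import math
import random

def cross_correlation(x, y, lag):
    """Sample cross-correlation between x[t] and y[t + lag], for lag >= 0."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((v - mean_x) ** 2 for v in x)
    var_y = sum((v - mean_y) ** 2 for v in y)
    cov = sum((x[t] - mean_x) * (y[t + lag] - mean_y) for t in range(n - lag))
    return cov / math.sqrt(var_x * var_y)

def iid_confidence_bound(n, z=1.96):
    """Approximate 95% bound for the correlation of two independent i.i.d. series."""
    return z / math.sqrt(n)

rng = random.Random(0)
# Hypothetical stand-ins for |R_t| and |R~_t|: a shared factor makes them co-move.
common = [abs(rng.gauss(0.0, 1.0)) for _ in range(5000)]
abs_price = [c * abs(rng.gauss(0.0, 1.0)) for c in common]
abs_volume = [c * abs(rng.gauss(0.0, 1.0)) for c in common]

bound = iid_confidence_bound(len(abs_price))
c0 = cross_correlation(abs_price, abs_volume, 0)
significant = abs(c0) > bound
```

A correlation that stays outside the band over many lags, as reported for the absolute values, is the signature of persistent co-movement rather than sampling noise.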

In Fig. 1(b) we find, by using the DFA method [38,39], that
not only $|R_t|$ [30,40], but also $|\tilde{R}_t|$ exhibit power-law
auto-correlations. As an indicator that there is an association
between $|R_t|$ and $|\tilde{R}_t|$, we note that during market crashes large
changes in price are associated with large changes in market
volume. To confirm co-movement between $|R_t|$ and $|\tilde{R}_t|$, in
Fig. 1(b) we demonstrate that $|R_t|$ and $|\tilde{R}_t|$ are power-law
cross-correlated with the DCCA cross-correlation exponent
(see Methods section) close to the DFA exponent [38,39]
corresponding to $|\tilde{R}_t|$. Thus, we find the cross-correlations
between $|\tilde{R}_{t+n}|$ and $|R_t|$ not only at zero time scale ($n = 0$),
but for a large range of time scales.
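
The DCCA procedure described in the Methods section can be sketched in pure Python as follows. This is an illustrative implementation, not the authors' code: it assumes a $1/(n-1)$ normalization of the covariance within each box, and the function names, the white-noise test series, and the box sizes are choices made here for the example.

```python
import math
import random

def integrate(series):
    """Integrated signal Y_k = y_1 + ... + y_k."""
    total, walk = 0.0, []
    for value in series:
        total += value
        walk.append(total)
    return walk

def linear_residuals(window):
    """Residuals of a least-squares straight-line fit to one box."""
    m = len(window)
    mean_t = (m - 1) / 2.0
    mean_y = sum(window) / m
    s_tt = sum((t - mean_t) ** 2 for t in range(m))
    s_ty = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(window))
    slope = s_ty / s_tt
    return [y - (mean_y + slope * (t - mean_t)) for t, y in enumerate(window)]

def dcca_covariance(y1, y2, n):
    """Detrended covariance averaged over the N - n overlapping boxes of n + 1 points."""
    walk1, walk2 = integrate(y1), integrate(y2)
    boxes = len(walk1) - n
    total = 0.0
    for i in range(boxes):
        r1 = linear_residuals(walk1[i:i + n + 1])
        r2 = linear_residuals(walk2[i:i + n + 1])
        total += sum(a * b for a, b in zip(r1, r2)) / (n - 1)
    return total / boxes

def scaling_exponent(y1, y2, box_sizes):
    """Slope of log F(n) against log n; requires positive detrended covariances."""
    log_n = [math.log(n) for n in box_sizes]
    log_f = [0.5 * math.log(dcca_covariance(y1, y2, n)) for n in box_sizes]
    mean_x, mean_y = sum(log_n) / len(log_n), sum(log_f) / len(log_f)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_n, log_f))
    s_xx = sum((x - mean_x) ** 2 for x in log_n)
    return s_xy / s_xx

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(1500)]
# With y1 == y2 the detrended covariance is the DFA detrended variance.
lambda_dfa = scaling_exponent(noise, noise, [8, 16, 32, 64])
```

For uncorrelated noise the exponent should come out near 0.5; values well above 0.5 for two different series would indicate power-law cross-correlations of the kind reported for the absolute price and volume changes.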

Having analyzed cross-correlations between corresponding
(absolute) changes in prices and volumes, we now investigate
the pdf of the absolute value of $\tilde{R}_t$ of Eq. (4). In order to test
whether an exponential or a power-law functional form fits the
data better, in Figs. 2(a) and (b) we show the pdf $P(\tilde{R})$ in both
linear-log and log-log plot. In Fig. 2(a) we see that the tail
substantially deviates from the central part of the pdf, which we fit by
an exponential function. In Fig. 2(b) we find that the tails of the
pdf can be well described by a power law $\tilde{R}^{-(1+\tilde{\alpha})}$ with exponent
$\tilde{\alpha} = 3 \pm 0.16$, which supports an inverse cubic law, virtually
the same as found for average stock price returns [15-17] and
individual companies [18].

In order to justify the previous finding, we employ two
additional methods. First, we introduce a new method
[described in Methods by Eqs. (8) and (9)] for a single financial
index. We analyze the probability that a trading volume
change $\tilde{R}$ has an absolute value larger than a given threshold,
$q$. We analyze the time series of the S&P500 index for 14,922
data points. First, we define different thresholds, ranging from
$2\sigma$ to $8\sigma$. For each $q$, we calculate the mean return interval,
$\bar{\tau}_q$. In Fig. 2(c) we find that $q$ and $\bar{\tau}_q$ follow the power law of
Eq. (9), where $\tilde{\alpha} = 2.97 \pm 0.02$. We note that the better the
power-law relation between $\bar{\tau}_q$ and $q$ in Fig. 2(c), the better
the power-law approximation $P(|\tilde{R}| > x) \approx x^{-\tilde{\alpha}}$ for the tail
of the pdf $P(|\tilde{R}|)$. In order to confirm our finding that $P(|\tilde{R}|)$
follows a power law $P(|\tilde{R}|) \approx |\tilde{R}|^{-(1+\tilde{\alpha})}$, where $\tilde{\alpha} \approx 3$, obtained
in Fig. 2(a) and 2(b), we also apply a third method, the Hill
estimator [35], to a single time series of the S&P500 index. We
obtain $\tilde{\alpha} = 2.80 \pm 0.07$, consistent with the results in Fig. 2(a)
and 2(b).
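
The Hill estimate of Eq. (10), with the tail restricted to at most 10% of the sample, can be sketched as below. The function name, the `tail_fraction` parameter, and the synthetic Pareto sample are assumptions introduced for illustration; the sketch does not reproduce the error bars quoted in the text.

```python
import math
import random

def hill_exponent(values, tail_fraction=0.10):
    """Hill estimate of the tail exponent from the largest values, following Eq. (10)."""
    ordered = sorted(values, reverse=True)           # x_1 > x_2 > ... > x_N
    n = max(2, int(tail_fraction * len(ordered)))    # N, at most 10% of the sample by default
    x_n = ordered[n - 1]                             # x_N, the smallest value kept
    log_sum = sum(math.log(x / x_n) for x in ordered[:n - 1])
    return (n - 1) / log_sum

# Synthetic sample with P(x > q) = q**(-3) for q >= 1 (assumed test data).
rng = random.Random(1)
sample = [(1.0 - rng.random()) ** (-1.0 / 3.0) for _ in range(50_000)]
alpha_hill = hill_exponent(sample)
```

The estimate depends on how much of the sample is counted as tail, which is why the text fixes a cutoff criterion before comparing values across data sets.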

Next, by using the procedure described in case (ii) of Methods,
we analyze 1,819 different time series of Eq. (4), each
representing one of the 1,819 members of the NYSE Composite
index. For each company, we calculate the normalized
volatility of trading-volume changes [see Eq. (6)]. In Figs. 3(a)
and (b) we show the pdf in both
linear-log and log-log plot. In Fig. 3(a) we see that the broad
central region of the pdf, from $2\sigma$ up to $15\sigma$, is fit by an
exponential function. However, the far tail deviates from the
exponential fit. In Fig. 3(b) we find that the tails of the pdf,
from $15\sigma$ up to $25\sigma$, are described by a power law $\tilde{R}^{-(1+\tilde{\alpha})}$
with exponent $\tilde{\alpha} = 4.65 \pm 1.00$.

FIG. 1: Auto-correlations and cross-correlations in absolute values
of price changes $|R_t|$ of Eq. (5) and trading-volume changes
$|\tilde{R}_t|$ of Eq. (6) for daily returns of the S&P500 index. (a) The
cross-correlation function $C(R, \tilde{R})$ between $R$ and $\tilde{R}$, and the
cross-correlation function $C(|R|, |\tilde{R}|)$ between $|R|$ and $|\tilde{R}|$. (b)
For $R(t)$ and $\tilde{R}(t)$, we show the rms of the detrended variance
$F_{\rm DFA}(n)$ for $|R|$ and $|\tilde{R}|$ and also the rms of the detrended
covariance [37], $F_{\rm DCCA}(n)$. The two DFA exponents $\lambda_{|R|}$ and $\lambda_{|\tilde{R}|}$
imply that power-law auto-correlations exist in both $|R|$ and $|\tilde{R}|$.
The DCCA exponent implies the presence of power-law
cross-correlations. Power-law cross-correlations between $|R|$ and $|\tilde{R}|$
imply that current price changes depend upon previous changes, but
also upon previous volume changes, and vice versa.

Then, by employing the method described by Eqs. (8) and
(9), we define different thresholds, $q$, ranging from $2\sigma$ to $8\sigma$
(a different range than in Fig. 3(a)). We choose the lowest $q$
equal to 2 since we employ the criterion that $N$ does not
exceed 10% of the sample size [36]. For each $q$, and each
company, we calculate the time series of return intervals, $\tau_q$. For
a given $q$, we then collect all the $\tau$ values obtained from all
companies in one unique data set, mimicking the market as
a whole, and calculate the average return interval, $\bar{\tau}_q$. In
Fig. 3(c) we find that $q$ and $\bar{\tau}_q$ follow an approximate inverse
cubic law of Eq. (9), where $\tilde{\alpha} = 3.1 \pm 0.11$. Our method is
sensitive to data insufficiency, so we show the results only up
to $8\sigma$. Clearly, this method gives the $\tilde{\alpha}$ value for the market as
a whole, not the $\tilde{\alpha}$ values for particular companies. By joining
all the normalized volatilities $|\tilde{R}_t|$ obtained from 1,819 time
series in one unique data set, we estimate Hill's exponent of
Eq. (10), $\tilde{\alpha} = 2.82 \pm 0.003$, consistent with the value of the
exponent obtained using the method of Eqs. (8) and (9).

FIG. 2: Pdf $P(|\tilde{R}|)$ of absolute value of differences in logarithm of
trading volume, $\tilde{R}$, of Eq. (4) for the S&P500 index. (a) A log-linear
plot of $P(|\tilde{R}|)$. The solid line is an exponential fit. The tail part of
the pdf deviates from the fit in the central part. (b) Log-log plot of the
pdf. The broad tail part can be explained by a power law $\tilde{R}^{-(1+\tilde{\alpha})}$ with
$\tilde{\alpha} = 3 \pm 0.16$. (c) For the absolute values of changes in trading
volume [see Eq. (4)], the average return interval $\bar{\tau}$ vs. threshold $q$ (in
units of standard deviation $\sigma$) follows a power law, with exponent
$\tilde{\alpha} = 2.97 \pm 0.02$. The power law is consistent with the inverse cubic
law of the pdf.

FIG. 3: Pdf of absolute value of differences in logarithm of trading
volume, $\tilde{R}$, of Eq. (4) for the members of the NYSE Composite
index. We use the method described in the Methods section, case
(ii), for normalized volatilities of Eq. (6). (a) From $1\sigma$ to $15\sigma$ we
show the linear-log plot of the pdf $P(\tilde{R})$. The straight line is an
exponential fit. The far tail of the pdf deviates from the fit in the central region
of the pdf. (b) Log-log plot of the pdf from $15\sigma$ to $25\sigma$. The tail part of the
pdf can be explained by a power law $\tilde{R}^{-(1+\tilde{\alpha})}$ with $\tilde{\alpha} = 4.2 \pm 0.26$.
(c) For the absolute values of changes in trading volume [see Eq. (4)]
we show the average return interval $\bar{\tau}_q$ versus threshold $q$ (in units of
a standard deviation). Up to $8\sigma$, we show a power law with exponent
$\tilde{\alpha} = 3.11 \pm 0.12$, which leads to the inverse cubic law.
In the previous analysis we consider time series of the companies
comprising the NYSE Composite index of different
lengths (from 10 to 11,966 data points). In order to show that
the Hill exponent of Eq. (10) is not affected by the shortest
time series, next we analyze only the time series longer than
3,000 data points (1,128 firms in total). For the Hill exponent
we obtain $\tilde{\alpha} = 2.81 \pm 0.003$, practically the
same value as the one ($\tilde{\alpha} = 2.82 \pm 0.003$) we obtained when short
time series were considered as well.

We also apply the method of Hill [35] and the method of
Eqs. (8) and (9) to the 500 members of the S&P500
index comprising the index in July 2009. There are in total
2,601,247 data points for the normalized $|\tilde{R}_t|$ of Eq. (6). For the thresholds,
$q$, ranging from $2\sigma$ to $10\sigma$, we find that $q$ and $\bar{\tau}_q$ follow for
this range an approximate inverse cubic law of Eq. (9), where
$\tilde{\alpha} = 3.1 \pm 0.12$. We estimate the Hill exponent of Eq. (10) to
be $\tilde{\alpha} = 2.86 \pm 0.005$, with the lowest $q = 2$.

In order to find the functional form for trading-volume
changes at the world level, we analyze 28 worldwide
financial indices using the procedure described in Methods
[case (ii)]. For each $q$, and for each of the 28 indices, we calculate
the values for the return interval $\tau$. Then for a given $q$, we
collect all the $\tau$ values obtained for all indices and calculate
the average return interval $\bar{\tau}_q$. In Fig. 4(a), we find a functional
dependence between $q$ and $\bar{\tau}_q$ which can be approximated by a
power law with exponent $\tilde{\alpha} = 2.41 \pm 0.06$. We also calculate
$\bar{\tau}_q$ vs. $q$ for different levels of financial aggregation.

Finally, in addition to trading-volume changes, we employ
for stock price changes our procedure for identifying power-law
behavior in the pdf tails described in Methods [case (ii)].
The pdf of stock price changes, calculated for an "average"
stock, is believed to follow $P(R) \sim R^{-(1+\alpha)}$, where $\alpha \approx 3$,
as empirically found for a wide range of different stock markets

macro. 

Next we test whether this law holds more generally. To this
end, we analyze the absolute values of price changes, $|R_t|$
[see Eq. (3)], for five different levels of financial aggregation:
(i) Europe, (ii) Asia, (iii) North and South America, (iv) the
world without the USA, and (v) the entire world. For each
level of aggregation, we find that the average return interval
$\bar{\tau}_q$ follows a power law in $q$ with an exponent close to $\alpha = 3$ [Fig. 4(b)].

FIG. 4: Power-law correlations for world-wide financial indices in
(a) absolute values of trading-volume changes ($|\tilde{R}|$) and (b) absolute values
of price changes ($|R|$). We use the method described by
Eqs. (8)-(9). (a) The average return interval $\bar{\tau}_q$ vs. threshold $q$ (in
units of standard deviation) for absolute values of trading-volume
changes. For each of 28 worldwide financial indices, we calculate
the corresponding $\tau_q$ values. Then we collect all the $\tau_q$ values
obtained from different indices, and show $\bar{\tau}_q$ versus $q$. Up to 8 standard
deviations, we find a power law with exponent $\tilde{\alpha} = 2.41 \pm 0.06$. (b)
The average return interval $\bar{\tau}_q$ vs. threshold $q$ for absolute values of
price changes [see Eq. (3)] for different levels of aggregation. For
each of five different types of aggregation reported, we find that $\bar{\tau}_q$
versus $q$ exhibits a power law with an exponent very close to $\alpha = 3$.

IV. MODEL



In order to model long-range cross-correlations between
$|R_t|$ and $|\tilde{R}_t|$, we introduce a new joint process for price
changes

$$ \epsilon_t = \sigma_t \eta_t, \tag{14} $$

$$ \sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2 + \gamma \tilde{\epsilon}_{t-1}^2, \tag{15} $$

and for trading-volume changes

$$ \tilde{\epsilon}_t = \tilde{\sigma}_t \tilde{\eta}_t, \tag{16} $$

$$ \tilde{\sigma}_t^2 = \tilde{\omega} + \tilde{\alpha} \tilde{\epsilon}_{t-1}^2 + \tilde{\beta} \tilde{\sigma}_{t-1}^2 + \tilde{\gamma} \epsilon_{t-1}^2. \tag{17} $$

If $\gamma = \tilde{\gamma} = 0$, Eqs. (14)-(17) reduce to two separate processes
of Ref. [41]. Here $\eta_t$ and $\tilde{\eta}_t$ are two i.i.d. stochastic
processes, each chosen as a Gaussian distribution with zero mean
and unit variance. In order to fit two time series, we define
free parameters $\omega, \alpha, \beta, \gamma, \tilde{\omega}, \tilde{\alpha}, \tilde{\beta}, \tilde{\gamma}$, which we assume to
be positive [41]. The process of Eqs. (14)-(17) is based on
the generalized autoregressive conditional heteroscedasticity
(GARCH) process (obtained from Eqs. (14)-(15) when $\gamma = 0$)
introduced to simulate long-range auto-correlations through
$\beta \neq 0$. The GARCH process also generates the power-law
tails as often found in empirical data [see [15-18], and also
Fig. 2(b)]. In the process of Eqs. (14)-(17) we obtain
cross-correlations since the time-dependent standard deviation $\sigma_t$ for
price changes depends not only on its past values (through
$\alpha$ and $\beta$), but also on past values of trading-volume errors ($\gamma$).
Similarly, $\tilde{\sigma}_t$ for trading-volume changes depends not only on
its past values (through $\tilde{\alpha}$ and $\tilde{\beta}$), but also on past values of
price errors ($\tilde{\gamma}$).



For the joint stochastic process of Eqs. (14)-(17) with $\beta =
\tilde{\beta} = 0.65$, $\alpha = \tilde{\alpha} = 0.14$, $\gamma = \tilde{\gamma} = 0.2$, we show in
Fig. 5(a) the cross-correlated time series $\epsilon_t$ and $\tilde{\epsilon}_t$ of Eqs. (14)-(17).
In Fig. 5(b) we show the auto-correlation function for $|\epsilon_t|$ and
the cross-correlation function, which practically overlap due to
the choice of parameters.

FIG. 5: Cross-correlations between two time series generated from
the stochastic process of Eqs. (14)-(17), with $\beta = \tilde{\beta} = 0.65$, $\alpha =
\tilde{\alpha} = 0.14$, $\gamma = \tilde{\gamma} = 0.2$, and $\omega = \tilde{\omega} = 0.01$. In panel (a) we
show the time series $\epsilon$ and $\tilde{\epsilon}$ of Eqs. (14)-(17), where the latter time
series is shifted for clarity. These two time series follow each other
due to the terms $\gamma \neq 0$ and $\tilde{\gamma} \neq 0$. In panel (b) we show the
auto-correlation function $A(n)$ for $|\epsilon_t|$ and the cross-correlation function
$C(|\epsilon|, |\tilde{\epsilon}|)$. The 95% confidence intervals for no cross-correlations
are shown (solid lines) along with the best exponential fit of $A(n)$
(dotted curve).
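
A simulation of the joint process, in which each conditional variance also depends on the other series' past squared error, can be sketched as follows. This is an illustration only, using the parameter values quoted for Fig. 5 with identical parameters for price and volume; the function names, the seed, the number of steps, and the choice to start both variances at the stationary value $\omega/(1-\alpha-\beta-\gamma)$ are assumptions made here.

```python
import math
import random

def stationary_variance(omega, alpha, beta, gamma):
    """omega / (1 - alpha - beta - gamma), valid for identical price and volume parameters."""
    persistence = alpha + beta + gamma
    if persistence >= 1.0:
        raise ValueError("need alpha + beta + gamma < 1 for a finite variance")
    return omega / (1.0 - persistence)

def simulate_joint_garch(steps, omega, alpha, beta, gamma, seed=0):
    """Simulate the coupled price and volume errors with symmetric parameters."""
    rng = random.Random(seed)
    var_p = var_v = stationary_variance(omega, alpha, beta, gamma)
    eps_p, eps_v = [], []
    for _ in range(steps):
        e_p = math.sqrt(var_p) * rng.gauss(0.0, 1.0)
        e_v = math.sqrt(var_v) * rng.gauss(0.0, 1.0)
        eps_p.append(e_p)
        eps_v.append(e_v)
        # Tuple assignment updates both variances from the same past values.
        var_p, var_v = (
            omega + alpha * e_p ** 2 + beta * var_p + gamma * e_v ** 2,
            omega + alpha * e_v ** 2 + beta * var_v + gamma * e_p ** 2,
        )
    return eps_p, eps_v

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

eps_price, eps_volume = simulate_joint_garch(20_000, 0.01, 0.14, 0.65, 0.2)
corr_abs = pearson([abs(e) for e in eps_price], [abs(e) for e in eps_volume])
```

Setting `gamma` to zero decouples the two recursions, which gives a simple way to check that any correlation between the absolute values comes from the coupling terms.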

If stationarity is assumed, we calculate the expectation of
Eqs. (15) and (17), and since, e.g., $E(\sigma_t^2) = E(\sigma_{t-1}^2) =
E(\epsilon_{t-1}^2) = \sigma_0^2$, we obtain $\sigma_0^2 (1 - \alpha - \beta) = \omega + \gamma \tilde{\sigma}_0^2$, and
similarly $\tilde{\sigma}_0^2 (1 - \tilde{\alpha} - \tilde{\beta}) = \tilde{\omega} + \tilde{\gamma} \sigma_0^2$. So, stationarity
generally assumes that $\alpha + \beta < 1$, as found for the GARCH
process [41]. However, for the choice of parameters in the
previous paragraph, for which $\sigma_0 = \tilde{\sigma}_0$, stationarity assumes
that $\sigma_0^2 (1 - \alpha - \beta - \gamma) = \omega$. This result explains why the
persistence of variance measured by $\alpha + \beta$ should become
negligible in the presence of volume in the GARCH process
[10]. In order to have finite $\sigma_0^2$, we must assume $\alpha + \beta + \gamma < 1$.



It is also possible to consider IGARCH and FIGARCH processes
with joint processes for price and volume change, a potential
avenue for future research [46].



V. SUMMARY 

In order to investigate possible relations between price
changes and volume changes, we analyze the properties of
$|\tilde{R}|$, the logarithmic volume change. We hypothesize that
the underlying processes for logarithmic price change $|R|$ and
logarithmic volume change $|\tilde{R}|$ are similar. Consequently, we
use the traditional methods that are used to analyze changes
in trading price to analyze changes in trading volume. Two
major empirical findings are:

(i) we analyze a well-known U.S. financial index, the
S&P500 index over the 59-year period 1950-2009, and find
power-law cross-correlations between $|R|$ and $|\tilde{R}|$. We find
no cross-correlations between $R$ and $\tilde{R}$.

(ii) we demonstrate that, at different levels of aggregation,
ranging from the S&P500 index to the aggregation of different
world-wide financial indices, $|\tilde{R}|$ approximately follows the
same cubic law as $|R|$. Also, we find that the central region of
the pdf, $P(|\tilde{R}|)$, follows an exponential function as reported
for annually recorded variables, such as GDP [42,43], company
sales [44], and stock prices [45].

In addition to empirical findings, we offer two theoretical 
results: 

(i) to estimate the tail exponent $\tilde{\alpha}$ for the pdf of $|\tilde{R}|$, we
develop an estimator which relates $\tilde{\alpha}$ of the cdf $P(|\tilde{R}| > x) \approx
x^{-\tilde{\alpha}}$ to the average return interval $\bar{\tau}_q$ between two consecutive
volatilities above a threshold $q$ [31].

(ii) we introduce a joint stochastic process for modeling
simultaneously $|R|$ and $|\tilde{R}|$, which generates the
cross-correlations between $|R|$ and $|\tilde{R}|$. We also provide conditions
for stationarity.